Asymptotics of Canonical and Saturated RNA Secondary Structures
نویسندگان
چکیده
It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is 1.104366 . n(-3/2) . 2.618034(n). In this paper, we study combinatorial asymptotics for two special subclasses of RNA secondary structures - canonical and saturated structures. Canonical secondary structures are defined to have no lonely (isolated) base pairs. This class of secondary structures was introduced by Bompfünewerer et al., who noted that the run time of Vienna RNA Package is substantially reduced when restricting computations to canonical structures. Here we provide an explanation for the speed-up, by proving that the asymptotic number of canonical RNA secondary structures is 2.1614 . n(-3/2) . 1.96798(n) and that the expected number of base pairs in a canonical secondary structure is 0.31724 . n. The asymptotic number of canonical secondary structures was obtained much earlier by Hofacker, Schuster and Stadler using a different method. Saturated secondary structures have the property that no base pairs can be added without violating the definition of secondary structure (i.e. introducing a pseudoknot or base triple). Here we show that the asymptotic number of saturated structures is 1.07427 . n(-3/2) . 2.35467(n), the asymptotic expected number of base pairs is 0.337361 . n, and the asymptotic number of saturated stem-loop structures is 0.323954 . 1.69562(n), in contrast to the number 2(n - 2) of (arbitrary) stem-loop structures as classically computed by Stein and Waterman. Finally, we apply the work of Drmota to show that the density of states for [all resp. canonical resp. saturated] secondary structures is asymptotically Gaussian. We introduce a stochastic greedy method to sample random saturated structures, called quasi-random saturated structures, and show that the expected number of base pairs is 0.340633 . n.
منابع مشابه
Relation Between RNA Sequences, Structures, and Shapes via Variation Networks
Background: RNA plays key role in many aspects of biological processes and its tertiary structure is critical for its biological function. RNA secondary structure represents various significant portions of RNA tertiary structure. Since the biological function of RNA is concluded indirectly from its primary structure, it would be important to analyze the relations between the RNA sequences and t...
متن کاملAsymptotic number of hairpins of saturated RNA secondary structures.
In the absence of chaperone molecules, RNA folding is believed to depend on the distribution of kinetic traps in the energy landscape of all secondary structures. Kinetic traps in the Nussinov energy model are precisely those secondary structures that are saturated, meaning that no base pair can be added without introducing either a pseudoknot or base triple. In this paper, we compute the asymp...
متن کاملCentral and local limit theorems for RNA structures.
A k-noncrossing RNA pseudoknot structure is a graph over {1,...,n} without 1-arcs, i.e. arcs of the form (i,i+1) and in which there exists no k-set of mutually intersecting arcs. In particular, RNA secondary structures are 2-noncrossing RNA structures. In this paper we prove a central and a local limit theorem for the distribution of the number of 3-noncrossing RNA structures over n nucleotides...
متن کاملCombinatorics of Saturated Secondary Structures of RNA
Following Zuker (1986), a saturated secondary structure for a given RNA sequence is a secondary structure such that no base pair can be added without violating the definition of secondary structure, e.g., without introducing a pseudoknot. In the Nussinov-Jacobson energy model (Nussinov and Jacobson, 1980), where the energy of a secondary structure is -1 times the number of base pairs, saturated...
متن کاملThe Expected Order of Saturated RNA Secondary Structures
Abstract Over the last 30 years the development of RNA secondary structure prediction algorithms have been guided and inspired by corresponding combinatorial studies where the RNA molecules are modeled as certain kind of planar graphs. The other way round, new algorithmic ideas gave rise to interesting combinatorial problems asking for a deeper understanding of the structures processed. One suc...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of bioinformatics and computational biology
دوره 7 5 شماره
صفحات -
تاریخ انتشار 2009